
Keyword Search Result

[Keyword] high-level synthesis(66hit)

41-60hit(66hit)

  • High-Level Area/Delay/Power Estimation for Low Power System VLSIs with Gated Clocks

    Shinichi NODA  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER

    Vol: E85-A No:4  Page(s): 827-834

    In high-level synthesis for system VLSIs, power consumption can be reduced efficiently by applying gated clocks. Because gated clocks reduce power consumption but increase area and delay, estimating the trade-off between power and area/delay is very important. In this paper, we discuss how much area, delay, and power vary when gated clocks are applied. We propose a simple gate-level circuit model and estimation equations. We vary the parameters of the proposed circuit model and evaluate power consumption by back-annotating gate-level simulation results to the original circuit. This paper also proposes a conditional expression for applying gated clocks, which shows whether or not applying them reduces power consumption. We confirm the accuracy of the proposed estimation equations by experiments.

  • An RTL Design-Space Exploration Method for High-Level Applications

    Peng-Cheng KAO  Chih-Kuang HSIEH  Ching-Feng SU  Allen C.-H. WU  

     
    PAPER-High Level Synthesis

    Vol: E84-A No:11  Page(s): 2648-2654

    In this paper, we present an RTL design-space exploration method for high-level applications. We formulate RTL design-space exploration as a performance-driven module selection problem and devise a dynamic-programming algorithm to solve it. We present an exploration flow that integrates commercial synthesis and layout tools with our proposed method. Experimental results demonstrate that generating AT-curves for all modules is the most time-consuming task in the design-space exploration process. Using the proposed 3-point AT projection approach, our method achieves, on average, an 80% speed-up in run time and 90% accuracy in design estimation.

  • An Area/Time Optimizing Algorithm in High-Level Synthesis of Control-Based Hardwares

    Nozomu TOGAWA  Masayuki IENAGA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER

    Vol: E84-A No:5  Page(s): 1166-1176

    This paper proposes an area/time optimizing algorithm in a high-level synthesis system for control-based hardware. Given a call graph in which each node corresponds to a control flow of an application program, the algorithm generates a set of state-transition graphs that represents the input call graph under area and timing constraints. The algorithm first generates state-transition graphs that satisfy only the timing constraint and then transforms them so that they also satisfy the area constraint. Since the algorithm is applied directly to control-flow graphs, it can deal with control flows such as bit-wise processes and conditional branches. Further, the algorithm synthesizes more than one hardware architecture candidate from a single call graph for an application program. Designers of an application program can select good hardware architectures among the candidates depending on multiple design criteria. Experimental results for several control-based hardware designs demonstrate the effectiveness and efficiency of the algorithm.

  • CAM Processor Synthesis Based on Behavioral Descriptions

    Nozomu TOGAWA  Tatsuhiko WAKUI  Tatsuhiko YODEN  Makoto TERAJIMA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Co-design and High-level Synthesis

    Vol: E83-A No:12  Page(s): 2464-2473

    CAM (Content Addressable Memory) units are generally designed so that they can be applied to a variety of application programs. However, when a particular application runs on CAM units, some functions of the CAM units may be used often while others may never be used. We consider that CAM units should be designed according to the requirements of a given application program. This paper proposes a CAM processor synthesis system based on behavioral descriptions. The input of the system is an application program written in C including CAM functions, and its output is a hardware description of a synthesized processor and the binary code executed on it. Since the system determines the functions in the CAM units and synthesizes a CAM processor according to the requirements of the application program, we expect that the synthesized CAM processor can execute the application program with small processor area and delay. Experimental results demonstrate its efficiency and effectiveness.

  • Hardware Synthesis from C Programs with Estimation of Bit Length of Variables

    Osamu OGAWA  Kazuyoshi TAKAGI  Yasufumi ITOH  Shinji KIMURA  Katsumasa WATANABE  

     
    PAPER

    Vol: E82-A No:11  Page(s): 2338-2346

    In hardware synthesis methods that start from high-level languages such as C, the optimization quality of the compiler has a great influence on the area and speed of the synthesized circuits. Among the hardware-oriented optimization methods required in such compilers, minimizing the bit length of the data paths is one of the most important issues. In this paper, we propose an algorithm for estimating the necessary bit length of variables for this purpose. The algorithm analyzes the control/data-flow graph translated from C programs and decides the bit length of each variable. In several experiments, the bit length of variables was reduced by half with respect to the declared length. This method is effective not only for reducing the circuit area but also for reducing the delay of operation units such as adders.
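The idea of deriving a bit length from the values a variable can actually take can be illustrated with a minimal interval-propagation sketch. This is not the authors' algorithm (which analyzes a full control/data-flow graph); the helper names `bits_for_range`, `add_range`, and `mul_range` and the example expression are assumptions made purely for illustration.

```python
def bits_for_range(lo, hi):
    """Smallest width that holds every integer in [lo, hi].

    Unsigned width if lo >= 0, otherwise two's-complement width.
    """
    if lo >= 0:
        return max(1, hi.bit_length())
    # Signed: need n with -2**(n-1) <= lo and hi <= 2**(n-1) - 1.
    return max((-lo - 1).bit_length(), hi.bit_length()) + 1


def add_range(a, b):
    """Value range of a + b given the ranges (lo, hi) of a and b."""
    return (a[0] + b[0], a[1] + b[1])


def mul_range(a, b):
    """Value range of a * b: the extremes occur at the range endpoints."""
    products = [x * y for x in a for y in b]
    return (min(products), max(products))


# Hypothetical example: x is declared as a 32-bit int but only holds 0..15.
x_range = (0, 15)
# y = x * 3 + 7
y_range = add_range(mul_range(x_range, (3, 3)), (7, 7))
y_bits = bits_for_range(*y_range)
```

Here `y` needs only 6 bits instead of the declared 32, which is the kind of data-path narrowing the abstract describes.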

  • A Stepwise Refinement Synthesis of Digital Systems for Testability Enhancement

    Taewhan KIM  Ki-Seok CHUNG  C. L. LIU  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E82-A No:6  Page(s): 1070-1081

    This paper presents a new data path synthesis algorithm which simultaneously takes into account three important design criteria: testability, design area, and total execution time. We define a goodness measure on the testability of a circuit based on three rules of thumb introduced in prior work on synthesis for testability. We then develop a stepwise refinement synthesis algorithm which carries out the scheduling and allocation tasks in an integrated fashion. Experimental results for benchmark and other circuit examples show that we were able to enhance the testability of circuits significantly with very little overhead in design area and execution time.

  • Module Selection Using Manufacturing Information

    Hiroyuki TOMIYAMA  Hiroto YASUURA  

     
    PAPER-High-level Synthesis

    Vol: E81-A No:12  Page(s): 2576-2584

    Since manufacturing processes inherently fluctuate, LSI chips produced from the same design have different propagation delays. However, the difference in delays caused by process fluctuation has rarely been considered in most existing high-level synthesis systems. This paper presents a new approach to module selection in high-level synthesis, which exploits the difference in functional unit delays. First, a module library model which assumes the probabilistic nature of functional unit delays is presented. Then, we propose a module selection problem and an algorithm which minimizes the cost per faultless chip. Experimental results demonstrate that the proposed algorithm finds optimal module selections which would not have been explored without manufacturing information.
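To make "cost per faultless chip" concrete, the following is a minimal sketch under strong simplifying assumptions that are not taken from the paper: each functional unit meets timing independently with a given probability, a chip is faultless only if all of its units meet timing, and the selection is found by exhaustive enumeration. The library contents, costs, and probabilities are hypothetical.

```python
from itertools import product

# Hypothetical module library:
# operation type -> list of (module name, cost, probability of meeting timing)
LIBRARY = {
    "add": [("ripple", 1.0, 0.80), ("cla", 1.5, 0.99)],
    "mul": [("array", 4.0, 0.90), ("wallace", 6.0, 0.98)],
}


def cost_per_good_chip(choice):
    """Total module cost divided by the fraction of chips that meet timing."""
    total_cost = sum(cost for _, cost, _ in choice)
    chip_yield = 1.0
    for _, _, prob in choice:
        chip_yield *= prob  # assumes independent delay variation per unit
    return total_cost / chip_yield


def best_selection(library):
    """Exhaustively pick one module per operation type, minimizing the metric."""
    ops = sorted(library)
    best = min(product(*(library[op] for op in ops)), key=cost_per_good_chip)
    names = {op: name for op, (name, _, _) in zip(ops, best)}
    return names, cost_per_good_chip(best)


selection, cost = best_selection(LIBRARY)
```

With these made-up numbers, the cheapest raw combination (`ripple` plus `array`, cost 5.0) loses to `cla` plus `array` once yield is taken into account, which illustrates why manufacturing information can change the preferred selection.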

  • A High-Level Synthesis System for Digital Signal Processing Based on Data-Flow Graph Enumeration

    Nozomu TOGAWA  Takafumi HISAKI  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-High-level Synthesis

    Vol: E81-A No:12  Page(s): 2563-2575

    This paper proposes a high-level synthesis system for datapath design of digital signal processing hardware. The system consists of four phases: (1) DFG (data-flow graph) generation, (2) scheduling, (3) resource binding, and (4) HDL (hardware description language) generation. In (1), the system does not generate a single best DFG representing the given behavioral description of the hardware, but several good DFGs representing it. In (2) and (3), several synthesis tools can be incorporated into the system depending on the required objectives. Thus we can obtain more than one datapath candidate for a behavioral description, together with area and performance evaluations. In (4), the best datapath design is selected from those candidates and its hardware description is generated. Experimental results from applying the system to several benchmarks show its effectiveness and efficiency.

  • High-Level Synthesis for Weakly Testable Data Paths

    Michiko INOUE  Kenji NODA  Takeshi HIGASHIMURA  Toshimitsu MASUZAWA  Hideo FUJIWARA  

     
    PAPER-Test Synthesis

    Vol: E81-D No:7  Page(s): 645-653

    We present a high-level synthesis scheme that considers weak testability of generated register-transfer level (RTL) data paths, as well as their area and performance. The weak testability, proposed in our previous work, is a testability measure of RTL data paths for non-scan design. In our scheme, we first extract a condition on resource sharing sufficient for weak testability from a data flow graph before synthesis, and treat the condition as design objectives in the following synthesis tasks. We propose heuristic synthesis algorithms which optimize area and the design objectives under the performance constraint.

  • A Fast Scheduling Algorithm Based on Gradual Time-Frame Reduction for Datapath Synthesis

    Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E81-A No:6  Page(s): 1231-1241

    This paper proposes a fast scheduling algorithm based on gradual time-frame reduction for datapath synthesis of digital signal processing hardware. The objective of the algorithm is to minimize the costs of functional units and registers and to maximize connectivity under a given computation time and initiation interval. Incorporating connectivity into the scheduling stage can reduce multiplexer counts in resource binding. The algorithm maximizes connectivity while maintaining low time complexity and obtains datapath designs with small total hardware costs in the high-level synthesis environment. The algorithm also resolves inter-iteration data dependencies and thus realizes pipelined datapaths. The experimental results demonstrate that, compared with eight conventional schedulers, the proposed algorithm reduces the multiplexer counts after resource binding while maintaining low costs for functional units and registers.
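A time frame in this context is commonly understood as the range of control steps in which an operation may be placed. The sketch below computes initial time frames with the standard ASAP/ALAP passes, assuming unit-delay operations and an acyclic data-flow graph; it does not implement the paper's gradual reduction or connectivity heuristics, and the function name, graph, and latency are hypothetical.

```python
def time_frames(ops, deps, latency):
    """Return {op: (asap_step, alap_step)} for unit-delay operations.

    ops     -- operation names in topological order
    deps    -- dict mapping an operation to the list of its predecessors
    latency -- total number of control steps allowed
    """
    # ASAP: earliest step is one after the latest predecessor.
    asap = {}
    for op in ops:
        asap[op] = max((asap[p] + 1 for p in deps.get(op, [])), default=1)

    # Build successor lists so ALAP can walk the graph backwards.
    succs = {op: [] for op in ops}
    for op, preds in deps.items():
        for p in preds:
            succs[p].append(op)

    # ALAP: latest step is one before the earliest successor.
    alap = {}
    for op in reversed(ops):
        alap[op] = min((alap[s] - 1 for s in succs[op]), default=latency)

    return {op: (asap[op], alap[op]) for op in ops}


# Hypothetical DFG: a1 = m1 + m2, a2 = a1 + m3, to be scheduled in 3 steps.
ops = ["m1", "m2", "a1", "m3", "a2"]
deps = {"a1": ["m1", "m2"], "a2": ["a1", "m3"]}
frames = time_frames(ops, deps, latency=3)
```

Only `m3` has any freedom here (steps 1 to 2); a scheduler that narrows time frames would decide where to place such operations.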

  • Bit and Word-Level Common Subexpression Elimination for the Synthesis of Linear Computations

    Akihiro MATSUURA  Akira NAGOYA  

     
    PAPER

    Vol: E81-A No:3  Page(s): 455-461

    In this paper, we propose a transformation technique for the multiplications of one variable with multiple constants, which are frequently seen in the various applications of signal processing, image processing, and so forth. The method is based on the exploration of common subexpressions among constants and reduces the number of shifts, additions, and subtractions to implement linear computations with hardware. Our method searches for regularity among elements of a linear transform using matrix decomposition and generates a reduced data-flow graph which preserves the full regularity. We show experimental results obtained using Discrete Cosine Transform (DCT) and Fast Fourier Transform (FFT) and illustrate the effectiveness of the method.

  • An Overlapped Scheduling Method for an Iterative Processing Algorithm with Conditional Operations

    Kazuhito ITO  Tatsuya KAWASAKI  

     
    PAPER

    Vol: E81-A No:3  Page(s): 429-438

    One way to execute a processing algorithm at high speed is parallel processing on multiple computing resources such as processors and functional units. To identify the minimum number of computing resources, the most important task is scheduling, which determines when each operation in the processing algorithm is executed. Among feasible schedules satisfying all the data dependencies in the processing algorithm, an overlapped schedule can achieve the fastest execution speed for an iterative processing algorithm. When a processing algorithm contains operations that are executed only under certain conditions, computing resources can be shared by those conditional operations. In this paper, we propose a scheduling method which derives an overlapped schedule in which the required number of computing resources is minimized by considering the sharing by conditional operations.

  • A Hierarchical Clustering Method for the Multiple Constant Multiplication Problem

    Akihiro MATSUURA  Mitsuteru YUKISHITA  Akira NAGOYA  

     
    PAPER

    Vol: E80-A No:10  Page(s): 1767-1773

    In this paper, we propose an efficient solution for the Multiple Constant Multiplication (MCM) problem. The method uses hierarchical clustering to exploit common subexpressions among constants and reduces the number of shifts, additions, and subtractions. The algorithm defines appropriate weights, which indicate operation priority, and selects common subexpressions, resulting in a minimum number of local operations. It can also be extended to various high-level synthesis tasks such as arbitrary linear transforms. Experimental results for several error-correcting codes, digital filters and Discrete Cosine Transforms (DCTs) have shown the effectiveness of our method.
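As a small illustration of why sharing common subexpressions among constants saves operations, the sketch below multiplies one variable by the constants 5 and 21 using only shifts and additions. The constants and function names are chosen for illustration; the hierarchical clustering and weighting that the paper uses to find such subexpressions are not shown.

```python
def multiply_separately(x):
    """Compute 5*x and 21*x independently: 3 additions in total."""
    y5 = (x << 2) + x                # 5x  = 4x + x          (1 addition)
    y21 = (x << 4) + (x << 2) + x    # 21x = 16x + 4x + x    (2 additions)
    return y5, y21, 3


def multiply_shared(x):
    """Reuse 5x inside 21x: 2 additions in total."""
    y5 = (x << 2) + x                # 5x  = 4x + x          (1 addition)
    y21 = (y5 << 2) + x              # 21x = 4 * (5x) + x    (1 addition)
    return y5, y21, 2


# Both variants produce the same products; only the adder count differs.
results_match = all(
    multiply_separately(x)[:2] == multiply_shared(x)[:2] == (5 * x, 21 * x)
    for x in range(-50, 51)
)
```

The binary pattern `101` (5) appears inside `10101` (21), so the partial product 5x can serve both outputs, saving one adder in this example.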

  • An Optimal Block Terminal Assignment Algorithm for VLSI Data Path Allocation

    Shoichiro YAMADA  

     
    LETTER

    Vol: E80-A No:3  Page(s): 564-566

    This paper presents an efficient optimal block terminal assignment algorithm based on integer programming for data path synthesis. The problem is to assign buses to commutable terminals on functional units such that the number of buses is minimized, after the scheduling and allocation of operations and registers have been done. Three methods are used in the algorithm to decrease the amount of computation.

  • A Floorplan Based Methodology for Data-Path Synthesis of Sub-Micron ASICs

    Vasily G. MOSHNYAGA  Keikichi TAMARU  

     
    PAPER-High-Level Synthesis

    Vol: E79-D No:10  Page(s): 1389-1395

    As IC fabrication technology enters the deep-submicron region with device feature sizes <0.35µm, interconnect becomes the dominant factor in the design of high-speed Application Specific Integrated Circuits (ASICs). This paper proposes a novel methodology for automated data-path synthesis of such circuits and outlines algorithms to support it. In contrast to other approaches, we formulate interconnect area/delay optimizations as high-level synthesis transformations and use them during synthesis to minimize the impact of wiring on circuit characteristics. Experiments with FIR filter implementations show that such a formulation, jointly with "on the fly" module generation and performance-driven floorplanning, provides more than a 30% reduction in wiring delay for deep-submicron designs.

  • Reclocking Controllers for Minimum Execution Time

    Pradip JHA  Sri PARAMESWARAN  Nikil DUTT  

     
    PAPER

    Vol: E78-A No:12  Page(s): 1715-1721

    In this paper we describe a method for resynthesizing the controller of a design for a fixed datapath with the objective of increasing the design's throughput by minimizing its total execution time. This work has tremendous potential in two important areas: one, design reuse for retargeting datapaths to new libraries, new technologies and different bit-widths; and two, back-annotation of physical design information during High-Level Synthesis (HLS), and subsequent adjustment of the design's schedule to account for realistic physical design information with minimal changes to the datapath. We present our approach using various formulations, prove optimality of our algorithm and demonstrate the effectiveness of our technique on several HLS benchmarks. We have observed improvements of up to 34% in execution time after straightforward application of our controller resynthesis technique to the outputs of HLS.

  • High-Level Synthesis --A Tutorial

    Allen C.-H. WU  Youn-Long LIN  

     
    INVITED PAPER-High-Level Synthesis

    Vol: E78-D No:3  Page(s): 209-218

    We give a tutorial on high-level synthesis of VLSI. The evolution of digital system synthesis techniques and the need for higher-level design automation tools are first discussed. We then point out issues essential to the successful development of a high-level synthesis system and its acceptance by designers. Techniques that have been proposed for various subtasks of high-level synthesis are surveyed. Possible applications of high-level synthesis in areas other than chip design are forecast. Finally, we point out several directions for possible future research.

  • High-Level Synthesis of a Multithreaded Processor for Image Generation

    Takao ONOYE  Toshihiro MASAKI  Isao SHIRAKAWA  Hiroaki HIRATA  Kozo KIMURA  Shigeo ASAHARA  Takayuki SAGISHIMA  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E78-A No:3  Page(s): 322-330

    This paper describes the design procedure of a multithreaded processor dedicated to image generation, carried out by means of the high-level synthesis tool PARTHENON. The processor employs a multithreaded architecture, which is a novel and promising approach to parallel image generation. This paper puts special stress on the high-level synthesis scheme, which can simplify the behavioral description of the structure and control of complex hardware and therefore enables the design of a complicated mechanism for a multithreaded processor. Implementation results of the synthesis are also shown to demonstrate the performance of the designed processor. This processor greatly improves on the image generation throughput so far attained by the conventional approach.

  • An Optimal Scheduling Approach Using Lower Bound in High-Level Synthesis

    Seong Yong OHM  Fadi J. KURDAHI  Chu Shik JHON  

     
    PAPER-High-Level Synthesis

    Vol: E78-D No:3  Page(s): 231-236

    This paper describes an optimal scheduling approach which finds the scheduling result of the minimum functional unit cost under the given timing constraint. In this method, a well-defined search space is constructed incrementally and traversed in a branch-and-bound manner. During the traversal, tighter lower bounds are estimated and utilized coupled with the upper bound on the optimal solution in pruning the search space effectively. This method is extended to support multi-cycling operations, operation chaining, pipelined functional units, and pipelined data paths. Experimental results on some benchmarks show the efficiency of the proposed approach.

  • Datapath Scheduling for Behavioral Description with Conditional Branches

    Akihisa YAMADA  Toshiki YAMAZAKI  Nagisa ISHIURA  Isao SHIRAKAWA  Takashi KAMBE  

     
    PAPER

    Vol: E77-A No:12  Page(s): 1999-2009

    A new approach is described for the datapath scheduling of behavioral descriptions containing nested conditional branches of arbitrary structures. This paper first investigates such a complex scheduling mechanism, and formulates an optimal scheduling problem as a 0-1 integer programming problem such that given a prescribed number of control steps, the total cost of functional units can be minimized. In this formulation, each constraint is expressed in the form of a Boolean function, which is set equal to 1 or 0 according as the constraint is satisfied or not, respectively, and a satisfiability problem is defined by the product of the Boolean functions. A procedure is then described, which intends to seek an optimal solution by means of a branch-and-bound method on a binary decision diagram representing the satisfiability problem. Experimental results are also shown, which demonstrate that our approach is of more practical use than the existing methods.
